7 research outputs found

    New Algorithms for Predicting Conformational Polymorphism and Inferring Direct Couplings for Side Chains of Proteins

    Protein crystals populate diverse conformational ensembles. Despite much evidence that there is widespread conformational polymorphism in protein side chains, most of the X-ray crystallography data are modelled by single conformations in the Protein Data Bank. The ability to extract or to predict these conformational polymorphisms is of crucial importance, as it facilitates deeper understanding of protein dynamics and functionality. This dissertation describes a computational strategy capable of predicting side-chain polymorphisms. The applied approach extends a particular class of algorithms for side-chain prediction by modelling the side-chain dihedral angles more appropriately as continuous rather than discrete variables. Employing a new inferential technique known as particle belief propagation (PBP), we predict residue-specific distributions that encode information about side-chain polymorphisms. The predicted polymorphisms are in relatively close agreement with results from a state-of-the-art approach based on X-ray crystallography data. That approach characterizes the conformational polymorphisms of side chains using electron density information, and has successfully discovered previously unmodelled conformations. Furthermore, it is known that coupled fluctuations and concerted motions of residues can reveal pathways of communication used for information propagation in a molecule and hence can help in understanding the "allostery" phenomenon in proteins. In order to characterize the coupled motions, most existing methods infer structural dependencies among a protein's residues. However, recent studies have highlighted the role of coupled side-chain fluctuations alone in the allosteric behaviour of proteins, in contrast to a common belief that the backbone motions play the main role in allostery.
These studies and the aforementioned recent discoveries about prevalent alternate side-chain conformations (conformational polymorphism) accentuate the need to devise new computational approaches that acknowledge side chains' roles. These approaches must also consider the polymorphic nature of the side chains and incorporate the effects of this polymorphism in the study of information transmission and functional interactions of residues in a molecule. Such frameworks can provide a more accurate understanding of allosteric behaviour. Hence, as a topic related to conformational polymorphism, this dissertation also addresses the problem of inferring directly coupled side chains. First, we present a novel approach to generate an ensemble of conformations and an efficient computational method to extract direct couplings of side chains in allosteric proteins. These direct couplings are used to provide sparse network representations of the coupled side chains. The framework is based on a fairly new statistical method, the graphical lasso (GLASSO), devised for sparse graph estimation. In the proposed GLASSO-based framework, side-chain conformational polymorphism is taken into account. It is shown that by studying the intrinsic dynamics of an inactive structure alone, we are able to construct a network of functionally crucial residues. Second, we show that the proposed method is capable of providing a magnified view of the coupled and conformationally polymorphic side chains. This model reveals couplings between the alternate conformations of a coupled residue pair. To the best of our knowledge, this is the first computational method for extracting networks of side chains' alternate conformations. Such networks help in providing a detailed image of side-chain dynamics in functionally important and conformationally polymorphic sites, such as binding and/or allosteric sites. This information may assist in new drug-design alternatives.
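The distinction between a direct coupling and a merely correlated pair can be illustrated with a small sketch. This is not the dissertation's GLASSO pipeline: it omits the sparsity penalty entirely and uses only the textbook first-order partial correlation, which for Gaussian variables is zero exactly when the corresponding precision-matrix entry is zero. The three-residue chain and its correlation values are hypothetical.

```python
import math

def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation between x and y after conditioning on a third variable z."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# Hypothetical chain A - B - C: A and C are correlated only through B.
r_ab, r_bc = 0.8, 0.8
r_ac = r_ab * r_bc  # indirect correlation induced by the shared neighbour B

direct_ac = partial_corr(r_ac, r_ab, r_bc)  # A-C given B: vanishes
direct_ab = partial_corr(r_ab, r_ac, r_bc)  # A-B given C: stays clearly non-zero

print(f"A-C marginal {r_ac:.2f}, conditional {direct_ac:.2f}")
print(f"A-B marginal {r_ab:.2f}, conditional {direct_ab:.2f}")
```

A sparse-graph estimator keeps the A-B and B-C edges and drops A-C, which is the sense in which the resulting network contains only direct couplings.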
Side-chain conformations are commonly represented by multivariate angular variables. However, the GLASSO and other existing methods that can be applied to the aforementioned inference task are not capable of handling multivariate angular data. This dissertation further proposes a novel method to infer direct couplings from this type of data, and shows that this method is useful for identifying functional regions and their interactions in allosteric proteins. The proposed framework is a novel extension of canonical correlation analysis (CCA), which we call "kernelized partial CCA" (or simply KPCCA). Using the conformational information and fluctuations of the inactive structure alone for allosteric proteins in the Ras and other Ras-like families, the KPCCA method identified allosterically important residues not only as strongly coupled ones but also in densely connected regions of the interaction graph formed by the inferred couplings. The results were in good agreement with other empirical findings and outperformed those obtained by the GLASSO-based framework. By studying distinct members of the Ras, Rho, and Rab sub-families, we show further that KPCCA is capable of inferring common allosteric characteristics in the small G protein super-family.
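Why angular data needs special treatment can be seen from a minimal sketch, which is unrelated to the KPCCA method itself: ordinary arithmetic ignores the fact that 350° and 10° are neighbours on the circle. The function name and the two dihedral values below are illustrative assumptions.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction of angles in degrees, computed on the unit circle."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

# Two hypothetical side-chain dihedral angles on either side of 0 degrees.
chi_angles = [350.0, 10.0]

naive_mean = sum(chi_angles) / len(chi_angles)  # arithmetic mean, points the wrong way
circ_mean = circular_mean_deg(chi_angles)       # lies at the 0/360 boundary

print(naive_mean, circ_mean)
```

Covariance-based tools such as the graphical lasso inherit the same problem, since they treat each angle as a point on the real line.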

    Bayesian Optimization Algorithm for Non-unique Oligonucleotide Probe Selection

    One important application of DNA microarrays is measuring the expression levels of genes. The quality of the microarray design, which includes selecting short oligonucleotide sequences (probes) to be affixed to the surface of the microarray, becomes a major issue. A good design is one that contains the minimum possible number of probes while retaining an acceptable ability to identify the targets present in the sample. We focus on the problem of computing the minimal set of probes that is able to identify each target of a sample, referred to as Non-unique Oligonucleotide Probe Selection. We present the application of an Estimation of Distribution Algorithm named the Bayesian Optimization Algorithm (BOA) to this problem, and consider the integration of BOA with one simple heuristic. We also present the application of our method, integrated with a decoding approach in a multiobjective optimization framework, to solve the problem when multiple targets are present in the sample.
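The probe-selection problem stated above can be made concrete with a simple greedy baseline. This sketch is not BOA and is not the heuristic the authors combine with it; it assumes a simplified formulation in which every target must bind at least one chosen probe and every pair of targets must be separated by at least one chosen probe. The probe and target names are hypothetical.

```python
from itertools import combinations

def greedy_probe_selection(hybridizes):
    """hybridizes maps each probe name to the set of targets it binds."""
    targets = set().union(*hybridizes.values())
    # Requirements: cover every target and separate every pair of targets.
    todo = {("cover", t) for t in targets}
    todo |= {("split", a, b) for a, b in combinations(sorted(targets), 2)}

    def satisfied(probe):
        bound = hybridizes[probe]
        done = {r for r in todo if r[0] == "cover" and r[1] in bound}
        # A probe separates two targets when it binds exactly one of them.
        done |= {r for r in todo
                 if r[0] == "split" and ((r[1] in bound) != (r[2] in bound))}
        return done

    chosen = []
    while todo:
        best = max(sorted(hybridizes), key=lambda p: len(satisfied(p)))
        gained = satisfied(best)
        if not gained:
            break  # no probe can meet the remaining requirements
        chosen.append(best)
        todo -= gained
    return chosen, todo

# Hypothetical incidence data: four non-unique probes, three targets.
incidence = {
    "p1": {"t1", "t2"},
    "p2": {"t2", "t3"},
    "p3": {"t1"},
    "p4": {"t3"},
}
chosen, unmet = greedy_probe_selection(incidence)
print(chosen, unmet)
```

Greedy selection gives no optimality guarantee, which is one reason population-based searches such as estimation of distribution algorithms are applied to this problem.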

    A Comparative Study of Cluster Detection Algorithms in Protein–Protein Interaction for Drug Target Discovery and Drug Repurposing

    The interactions between drugs and their target proteins induce altered expression of genes involved in complex intracellular networks. The properties of these functional network modules are critical for the identification of drug targets, for drug repurposing, and for understanding the underlying mode of action of the drug. The topological modules generated by a computational approach are defined as functional clusters. However, the functions inferred for these topological modules extracted from a large-scale molecular interaction network, such as a protein–protein interaction (PPI) network, could differ depending on the cluster detection algorithm used. Moreover, the dynamic gene expression profiles among tissues or cell types cause differential functional interaction patterns between the molecular components. Thus, the connections in the PPI network should be modified by the transcriptomic landscape of specific cell lines before producing topological clusters. Here, we systematically investigated the clusters of a cell-based PPI network by using four cluster detection algorithms. We subsequently compared the performance of these algorithms for target gene prediction, which integrates gene perturbation data with the cell-based PPI network using two drug target prioritization methods, shortest path and diffusion correlation. In addition, we validated the proportion of perturbed genes in clusters by finding candidate anti-breast cancer drugs and confirming our predictions using literature evidence and cases in ClinicalTrials.gov. Our results indicate that the Walktrap (CW) clustering algorithm achieved the best performance overall in our comparative study.
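The abstract names shortest path as one prioritization method without giving its details, so the following is only a guess at the general idea, not the authors' scoring: rank each candidate target by its mean unweighted shortest-path distance to the perturbed genes in the network. The graph, the node names, and the scoring rule are all assumptions made for illustration.

```python
from collections import deque

def bfs_distances(graph, source):
    """Unweighted shortest-path distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def rank_candidates(graph, perturbed, candidates):
    """Order candidates by mean distance to the perturbed genes (closest first)."""
    scores = {}
    for cand in candidates:
        dist = bfs_distances(graph, cand)
        reachable = [dist[g] for g in perturbed if g in dist]
        scores[cand] = sum(reachable) / len(reachable) if reachable else float("inf")
    return sorted(candidates, key=lambda c: (scores[c], c)), scores

# Hypothetical undirected PPI path A - B - C - D - E as adjacency lists.
ppi = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}
ranking, scores = rank_candidates(ppi, perturbed=["A", "B"], candidates=["C", "E"])
print(ranking, scores)
```

In a cell-specific setting, the edge list would first be filtered or reweighted by expression data, as the abstract describes, before distances are computed.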